Improved n-gram phonotactic models for language recognition

نویسندگان

  • Mohamed Faouzi BenZeghiba
  • Jean-Luc Gauvain
  • Lori Lamel
چکیده

This paper investigates various techniques to improve the estimation of n-gram phonotactic models for language recognition using single-best phone transcriptions and phone lattices. More precisely, we first report on the impact of the so-called acoustic scale factor on the system accuracy when using latticebased training, and then we report on the use of n-gram cutoff and entropy pruning techniques. Several system configurations are explored, such as the use of context-independent and context-dependent phone models, the use of single-best phone hypotheses versus phone lattices, and the use of various n-gram orders. Experiments are conducted using the LRE 2007 evaluation data and the results are reported using the a posteriori EER. The results show that the impact of these techniques on the system accuracy is highly dependent on the training conditions and that careful optimization can lead to performance improvements.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Language modeling of Chinese personal names based on character units for continuous Chinese speech recognition

In this paper, we analyze Chinese personal names to model their statistical phonotactic characteristics for continuous Chinese speech recognition. The analysis showed languagespecific characteristics of Chinese personal names and strongly suggested the advantage of character-unit oriented modeling. A hierarchical language model was composed by reflecting statistical phonotactic characteristics ...

متن کامل

Selecting phonotactic features for language recognition

This paper studies feature selection in phonotactic language recognition. The phonotactic feature is presented by n-gram statistics derived from one or more phone recognizers in the form of high dimensional feature vectors. Two feature selection strategies are proposed to select the n-gram statistics for reducing the dimension of feature vectors, so that higher order n-gram features can be adop...

متن کامل

iVector Approach to Phonotactic Language Recognition

This paper addresses a novel technique for representation and processing of n-gram counts in phonotactic language recognition (LRE): subspace multinomial modelling represents the vectors of n-gram counts by low dimensional vectors of coordinates in total variability subspace, called iVector. Two techniques for iVector scoring are tested: support vector machines (SVM), and logistic regression (L...

متن کامل

Phonotactic language recognition based on time-gap-weighted lattice kernels

Phonotactic method for spoken language recognition (SLR) deals with permissible phone patterns and their frequencies of occurrence in a specific language. Phone recognizers followed by vector space models (PR-VSM) system is a state-of-the-art phonotactic language identification system, in which any utterance can be mapped into a supervector filled with likelihood scores of the n-gram tokens (ba...

متن کامل

Using cross-decoder co-occurrences of phone n-grams in SVM-based phonotactic language recognition

Most common approaches to phonotactic language recognition deal with several independent phone decoders. Decodings are processed and scored in a fully uncoupled way, their time alignment (and the information that may be extracted from it) being completely lost. Recently, we have presented a new approach to phonotactic language recognition which takes into account time alignment information, by ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010